Chapter 15

STATISTICS

“Statistics may be rightly called the science of averages and their
estimates.” - A.L. BOWLEY & A.L. BODDINGTON


15.1 Introduction 

We know that statistics deals with data collected for specific 
purposes. We can make decisions about the data by 
analysing and interpreting it. In earlier classes, we have 
studied methods of representing data graphically and in 
tabular form. This representation reveals certain salient 
features or characteristics of the data. We have also studied 
the methods of finding a representative value for the given 
data. This value is called the measure of central tendency. 

Recall that mean (arithmetic mean), median and mode are three
measures of central tendency. A measure of central
tendency gives us a rough idea of where the data points are
centred. But, in order to make a better interpretation of the
data, we should also have an idea of how the data are scattered or how much they are
bunched around a measure of central tendency.

Consider now the runs scored by two batsmen in their last ten matches as follows:

Batsman A : 30, 91, 0, 64, 42, 80, 30, 5, 117, 71
Batsman B : 53, 46, 48, 50, 53, 53, 58, 60, 57, 52

Clearly, the mean and median of the data are

            Batsman A    Batsman B
Mean            53           53
Median          53           53

Recall that we calculate the mean of a data set (denoted by $\bar{x}$) by dividing the sum
of the observations by the number of observations, i.e.,

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

[Portrait: Karl Pearson (1857-1936)]

Also, the median is obtained by first arranging the data in ascending or descending
order and applying the following rule.

If the number of observations is odd, then the median is the $\left(\frac{n+1}{2}\right)$th observation.

If the number of observations is even, then the median is the mean of the $\left(\frac{n}{2}\right)$th
and $\left(\frac{n}{2}+1\right)$th observations.


We find that the mean and median of the runs scored by both the batsmen A and
B are the same, i.e., 53. Can we say that the performance of the two players is the same? Clearly
not, because the scores of batsman A vary from 0 (minimum) to 117
(maximum), whereas the runs scored by batsman B range only from 46 to 60.
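
The comparison above can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the variable names `batsman_a` and `batsman_b` are chosen here only for illustration.

```python
from statistics import mean, median

# Runs scored by the two batsmen in their last ten matches (from the text).
batsman_a = [30, 91, 0, 64, 42, 80, 30, 5, 117, 71]
batsman_b = [53, 46, 48, 50, 53, 53, 58, 60, 57, 52]

for name, runs in (("A", batsman_a), ("B", batsman_b)):
    # mean = sum of observations / number of observations
    # median = middle value (or mean of the two middle values) of the sorted data
    print("Batsman", name,
          "mean:", mean(runs),
          "median:", median(runs),
          "min:", min(runs),
          "max:", max(runs))
```

Both batsmen come out with the same mean and median, while the minimum and maximum differ sharply, which is the point the text makes.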

Let us now plot the above scores as dots on a number line. We find the following
diagrams:

For batsman A

[Fig 15.1: dots for the scores of batsman A on a number line marked from 0 to 120]

For batsman B

[Fig 15.2: dots for the scores of batsman B on a number line marked from 0 to 120]

We can see that the dots corresponding to batsman B are close to each other and 
are clustering around the measure of central tendency (mean and median), while those 
corresponding to batsman A are scattered or more spread out. 

Thus, the measures of central tendency are not sufficient to give complete
information about a given data set. Variability is another factor which is required to be
studied under statistics. Like ‘measures of central tendency’, we want to have a
single number to describe variability. This single number is called a ‘measure of
dispersion’. In this Chapter, we shall learn some of the important measures of dispersion
and their methods of calculation for ungrouped and grouped data.

15.2 Measures of Dispersion

The dispersion or scatter in data is measured on the basis of the observations and the
type of measure of central tendency used. There are the following measures of
dispersion:

(i) Range, (ii) Quartile deviation, (iii) Mean deviation, (iv) Standard deviation. 

In this Chapter, we shall study all of these measures of dispersion except the 
quartile deviation. 

15.3 Range 

Recall that, in the example of runs scored by two batsmen A and B, we had some idea 
of variability in the scores on the basis of minimum and maximum runs in each series. 
To obtain a single number for this, we find the difference of maximum and minimum 
values of each series. This difference is called the ‘Range’ of the data. 

In the case of batsman A, Range = 117 - 0 = 117 and for batsman B, Range = 60 - 46 = 14.
Clearly, Range of A > Range of B. Therefore, the scores are scattered or dispersed in
the case of A, while for B they are close to each other.

Thus, Range of a series = Maximum value - Minimum value. 
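
As a small illustration of this definition, the range can be computed as below. This is only a sketch; the function name `data_range` is a hypothetical helper, not something defined in the text.

```python
def data_range(values):
    """Range of a series = maximum value - minimum value."""
    return max(values) - min(values)


# Runs of the two batsmen, as listed earlier in the chapter.
batsman_a = [30, 91, 0, 64, 42, 80, 30, 5, 117, 71]
batsman_b = [53, 46, 48, 50, 53, 53, 58, 60, 57, 52]

print("Range of A:", data_range(batsman_a))
print("Range of B:", data_range(batsman_b))
```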

The range of data gives us a rough idea of variability or scatter but does not tell 
about the dispersion of the data from a measure of central tendency. For this purpose, 
we need some other measure of variability. Clearly, such measure must depend upon 
the difference (or deviation) of the values from the central tendency. 

The important measures of dispersion, which depend upon the deviations of the 
observations from a central tendency are mean deviation and standard deviation. Let 
us discuss them in detail. 

15.4 Mean Deviation 

Recall that the deviation of an observation x from a fixed value ‘a’ is the difference
x - a. In order to find the dispersion of values of x from a central value ‘a’, we find the
deviations about a. An absolute measure of dispersion is the mean of these deviations.
To find the mean, we must obtain the sum of the deviations. But, we know that a
measure of central tendency lies between the maximum and the minimum values of
the set of observations. Therefore, some of the deviations will be negative and some
positive. Thus, the sum of deviations may vanish. Moreover, the sum of the deviations
from the mean ($\bar{x}$) is zero.

Also, Mean of deviations = (Sum of deviations) / (Number of observations) = 0/n = 0

Thus, finding the mean of deviations about the mean is not of any use for us, as far
as the measure of dispersion is concerned.
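
A quick numerical check of this cancellation, as a sketch: the data are the runs of batsman A from earlier in the chapter, and the variable names are illustrative only.

```python
scores = [30, 91, 0, 64, 42, 80, 30, 5, 117, 71]

x_bar = sum(scores) / len(scores)          # mean of the observations
deviations = [x - x_bar for x in scores]   # signed deviations x - x_bar

# Positive and negative deviations cancel, so their sum (and mean) is zero.
total = sum(deviations)
print("sum of deviations:", total)
print("mean of deviations:", total / len(scores))
```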









Remember that, in finding a suitable measure of dispersion, we require the distance
of each value from a central tendency or a fixed number ‘a’. Recall that the absolute
value of the difference of two numbers gives the distance between the numbers when
represented on a number line. Thus, to find the measure of dispersion from a fixed
number ‘a’ we may take the mean of the absolute values of the deviations from the
central value. This mean is called the ‘mean deviation’. Thus mean deviation about a
central value ‘a’ is the mean of the absolute values of the deviations of the observations
from ‘a’. The mean deviation from ‘a’ is denoted as M.D. (a). Therefore,

M.D. (a) = (Sum of absolute values of deviations from ‘a’) / (Number of observations)

Mean deviation may be obtained from any measure of central tendency.
However, mean deviation from mean and median are commonly used in statistical
studies.

Let us now learn how to calculate mean deviation about mean and mean deviation
about median for various types of data.

15.4.1 Mean deviation for ungrouped data Let n observations be $x_1, x_2, x_3, \ldots, x_n$.
The following steps are involved in the calculation of mean deviation about mean or
median:

Step 1 Calculate the measure of central tendency about which we are to find the mean
deviation. Let it be ‘a’.

Step 2 Find the deviation of each $x_i$ from a, i.e., $x_1 - a, x_2 - a, x_3 - a, \ldots, x_n - a$

Step 3 Find the absolute values of the deviations, i.e., drop the minus sign (-), if it is
there, i.e., $|x_1 - a|, |x_2 - a|, |x_3 - a|, \ldots, |x_n - a|$

Step 4 Find the mean of the absolute values of the deviations. This mean is the mean
deviation about a, i.e.,

$$\text{M.D.}(a) = \frac{\sum_{i=1}^{n} |x_i - a|}{n}$$

Thus $\text{M.D.}(\bar{x}) = \frac{1}{n}\sum_{i=1}^{n} |x_i - \bar{x}|$, where $\bar{x}$ = Mean

and $\text{M.D.}(M) = \frac{1}{n}\sum_{i=1}^{n} |x_i - M|$, where M = Median

Note In this Chapter, we shall use the symbol M to denote median unless stated
otherwise.

Let us now illustrate the steps of the above method in following examples.
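
The four steps above can be sketched in a few lines of Python. This is an illustrative sketch, not a prescribed implementation: the function name `mean_deviation` and its `about` parameter are assumptions introduced here, and the central value is computed with the standard `statistics` module.

```python
from statistics import mean, median


def mean_deviation(data, about="mean"):
    """Mean deviation of ungrouped data about the mean or the median."""
    # Step 1: choose the central value a.
    a = mean(data) if about == "mean" else median(data)
    # Steps 2 and 3: deviations x_i - a and their absolute values.
    absolute_deviations = [abs(x - a) for x in data]
    # Step 4: mean of the absolute deviations.
    return sum(absolute_deviations) / len(data)


sample = [6, 7, 10, 12, 13, 4, 8, 12]
print("M.D. about mean:", mean_deviation(sample))
print("M.D. about median:", mean_deviation(sample, about="median"))
```

Passing the central value as a choice between `"mean"` and `"median"` mirrors the text, which uses the same steps for either measure.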


Example 1 Find the mean deviation about the mean for the following data:

6, 7, 10, 12, 13, 4, 8, 12

Solution We proceed step-wise and get the following:

Step 1 Mean of the given data is

$$\bar{x} = \frac{6 + 7 + 10 + 12 + 13 + 4 + 8 + 12}{8} = \frac{72}{8} = 9$$

Step 2 The deviations of the respective observations from the mean $\bar{x}$, i.e., $x_i - \bar{x}$ are
6 - 9, 7 - 9, 10 - 9, 12 - 9, 13 - 9, 4 - 9, 8 - 9, 12 - 9,
or -3, -2, 1, 3, 4, -5, -1, 3

Step 3 The absolute values of the deviations, i.e., $|x_i - \bar{x}|$ are
3, 2, 1, 3, 4, 5, 1, 3

Step 4 The required mean deviation about the mean is

$$\text{M.D.}(\bar{x}) = \frac{\sum_{i=1}^{8} |x_i - \bar{x}|}{8} = \frac{3 + 2 + 1 + 3 + 4 + 5 + 1 + 3}{8} = \frac{22}{8} = 2.75$$

Note Instead of carrying out the steps every time, we can carry on the calculation
step-wise without referring to the steps.


Example 2 Find the mean deviation about the mean for the following data:

12, 3, 18, 17, 4, 9, 17, 19, 20, 15, 8, 17, 2, 3, 16, 11, 3, 1, 0, 5

Solution We have to first find the mean ($\bar{x}$) of the given data

$$\bar{x} = \frac{1}{20}\sum_{i=1}^{20} x_i = \frac{200}{20} = 10$$

The respective absolute values of the deviations from mean, i.e., $|x_i - \bar{x}|$ are
2, 7, 8, 7, 6, 1, 7, 9, 10, 5, 2, 7, 8, 7, 6, 1, 7, 9, 10, 5

Therefore $\sum_{i=1}^{20} |x_i - \bar{x}| = 124$

and $\text{M.D.}(\bar{x}) = \frac{124}{20} = 6.2$


Example 3 Find the mean deviation about the median for the following data:

3, 9, 5, 3, 12, 10, 18, 4, 7, 19, 21.

Solution Here the number of observations is 11 which is odd. Arranging the data into
ascending order, we have 3, 3, 4, 5, 7, 9, 10, 12, 18, 19, 21

Now Median = $\left(\frac{11 + 1}{2}\right)$th or 6th observation = 9

The absolute values of the respective deviations from the median, i.e., $|x_i - M|$ are
6, 6, 5, 4, 2, 0, 1, 3, 9, 10, 12

Therefore $\sum_{i=1}^{11} |x_i - M| = 58$

and $\text{M.D.}(M) = \frac{1}{11}\sum_{i=1}^{11} |x_i - M| = \frac{1}{11} \times 58 = 5.27$


15.4.2 Mean deviation for grouped data We know that data can be grouped in
two ways:

(a) Discrete frequency distribution,

(b) Continuous frequency distribution.

Let us discuss the method of finding mean deviation for both types of the data.

(a) Discrete frequency distribution Let the given data consist of n distinct values
$x_1, x_2, \ldots, x_n$ occurring with frequencies $f_1, f_2, \ldots, f_n$ respectively. This data can be
represented in the tabular form as given below, and is called discrete frequency
distribution:

x :  $x_1$   $x_2$   $x_3$   ...   $x_n$
f :  $f_1$   $f_2$   $f_3$   ...   $f_n$

(i) Mean deviation about mean

First of all we find the mean $\bar{x}$ of the given data by using the formula

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i f_i}{\sum_{i=1}^{n} f_i} = \frac{1}{N}\sum_{i=1}^{n} x_i f_i,$$

where $\sum_{i=1}^{n} x_i f_i$ denotes the sum of the products of observations $x_i$ with their respective
frequencies $f_i$ and $N = \sum_{i=1}^{n} f_i$ is the sum of the frequencies.

Then, we find the deviations of observations $x_i$ from the mean $\bar{x}$ and take their
absolute values, i.e., $|x_i - \bar{x}|$ for all i = 1, 2, ..., n.

After this, find the mean of the absolute values of the deviations, which is the
required mean deviation about the mean. Thus

$$\text{M.D.}(\bar{x}) = \frac{\sum_{i=1}^{n} f_i |x_i - \bar{x}|}{\sum_{i=1}^{n} f_i} = \frac{1}{N}\sum_{i=1}^{n} f_i |x_i - \bar{x}|$$

(ii) Mean deviation about median To find mean deviation about median, we find the
median of the given discrete frequency distribution. For this the observations are arranged
in ascending order. After this the cumulative frequencies are obtained. Then, we identify
the observation whose cumulative frequency is equal to or just greater than $\frac{N}{2}$, where
N is the sum of frequencies. This value of the observation lies in the middle of the data,
therefore, it is the required median. After finding median, we obtain the mean of the
absolute values of the deviations from median. Thus,

$$\text{M.D.}(M) = \frac{1}{N}\sum_{i=1}^{n} f_i |x_i - M|$$
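
The procedure for a discrete frequency distribution can be sketched as follows. The function names and the small distribution `xs`, `fs` are hypothetical, introduced only for illustration; `frequency_median` follows the cumulative-frequency rule stated above and assumes the observations are already in ascending order.

```python
def frequency_mean(xs, fs):
    """Mean of a discrete frequency distribution: sum(f_i * x_i) / N."""
    return sum(f * x for x, f in zip(xs, fs)) / sum(fs)


def frequency_median(xs, fs):
    """Observation whose cumulative frequency is equal to or just greater than N/2.

    Assumes xs is sorted in ascending order.
    """
    half = sum(fs) / 2
    cumulative = 0
    for x, f in zip(xs, fs):
        cumulative += f
        if cumulative >= half:
            return x
    raise ValueError("empty distribution")


def frequency_mean_deviation(xs, fs, a):
    """M.D.(a) = (1/N) * sum(f_i * |x_i - a|)."""
    return sum(f * abs(x - a) for x, f in zip(xs, fs)) / sum(fs)


# Illustrative (hypothetical) distribution, not taken from the text.
xs = [1, 2, 3, 4]
fs = [2, 3, 4, 2]

m = frequency_median(xs, fs)
print("median:", m)
print("M.D. about median:", frequency_mean_deviation(xs, fs, m))
print("M.D. about mean:", frequency_mean_deviation(xs, fs, frequency_mean(xs, fs)))
```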

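Example 4 Find mean deviation about the mean for the following data:

$x_i$ :  2   5   6   8   10  12
$f_i$ :  2   8   10  7   8   5

Solution Let us make a Table 15.1 of the given data and append other columns after
calculations.

Table 15.1

$x_i$    $f_i$    $f_i x_i$    $|x_i - \bar{x}|$    $f_i |x_i - \bar{x}|$
2        2        4            5.5                  11
5        8        40           2.5                  20
6        10       60           1.5                  15
8        7        56           0.5                  3.5
10       8        80           2.5                  20
12       5        60           4.5                  22.5
         40       300                               92

$N = \sum_{i=1}^{6} f_i = 40$, $\sum_{i=1}^{6} f_i x_i = 300$, $\sum_{i=1}^{6} f_i |x_i - \bar{x}| = 92$

Therefore $\bar{x} = \frac{1}{N}\sum_{i=1}^{6} f_i x_i = \frac{1}{40} \times 300 = 7.5$

and $\text{M.D.}(\bar{x}) = \frac{1}{N}\sum_{i=1}^{6} f_i |x_i - \bar{x}| = \frac{1}{40} \times 92 = 2.3$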

Example 5 Find the mean deviation about the median for the following data:

$x_i$ :  3   6   9   12  13  15  21  22
$f_i$ :  3   4   5   2   4   5   4   3

Solution The given observations are already in ascending order. Adding a row
corresponding to cumulative frequencies to the given data, we get (Table 15.2).

Table 15.2

$x_i$ :  3   6   9   12  13  15  21  22
$f_i$ :  3   4   5   2   4   5   4   3
c.f.  :  3   7   12  14  18  23  27  30

Now, N = 30 which is even.

Median is the mean of the 15th and 16th observations. Both of these observations
lie in the cumulative frequency 18, for which the corresponding observation is 13.

Therefore, Median M = (15th observation + 16th observation) / 2 = (13 + 13) / 2 = 13

Now, absolute values of the deviations from median, i.e., $|x_i - M|$ are shown in
Table 15.3.

Table 15.3

$|x_i - M|$       :  10  7   4   1   0   2   8   9
$f_i$             :  3   4   5   2   4   5   4   3
$f_i |x_i - M|$   :  30  28  20  2   0   10  32  27

We have $\sum_{i=1}^{8} f_i = 30$ and $\sum_{i=1}^{8} f_i |x_i - M| = 149$

Therefore $\text{M.D.}(M) = \frac{1}{N}\sum_{i=1}^{8} f_i |x_i - M| = \frac{1}{30} \times 149 = 4.97$.

(b) Continuous frequency distribution A continuous frequency distribution is a series
in which the data are classified into different class-intervals without gaps along with
their respective frequencies.

For example, marks obtained by 100 students are presented in a continuous
frequency distribution as follows:

Marks obtained      :  0-10  10-20  20-30  30-40  40-50  50-60
Number of Students  :  12    18     27     20     17     6


(i) Mean deviation about mean While calculating the mean of a continuous frequency 
distribution, we had made the assumption that the frequency in each class is centred at 
its mid-point. Here also, we write the mid-point of each given class and proceed further 
as for a discrete frequency distribution to find the mean deviation. 

Let us take the following example.

Example 6 Find the mean deviation about the mean for the following data.

Marks obtained      :  10-20  20-30  30-40  40-50  50-60  60-70  70-80
Number of students  :  2      3      8      14     8      3      2

Solution We make the following Table 15.4 from the given data:

Table 15.4

Marks       Number of        Mid-points    $f_i x_i$    $|x_i - \bar{x}|$    $f_i |x_i - \bar{x}|$
obtained    students $f_i$   $x_i$
10-20       2                15            30           30                   60
20-30       3                25            75           20                   60
30-40       8                35            280          10                   80
40-50       14               45            630          0                    0
50-60       8                55            440          10                   80
60-70       3                65            195          20                   60
70-80       2                75            150          30                   60
            40                             1800                              400

Here $N = \sum_{i=1}^{7} f_i = 40$, $\sum_{i=1}^{7} f_i x_i = 1800$, $\sum_{i=1}^{7} f_i |x_i - \bar{x}| = 400$

Therefore $\bar{x} = \frac{1}{N}\sum_{i=1}^{7} f_i x_i = \frac{1800}{40} = 45$

and $\text{M.D.}(\bar{x}) = \frac{1}{N}\sum_{i=1}^{7} f_i |x_i - \bar{x}| = \frac{1}{40} \times 400 = 10$

Shortcut method for calculating mean deviation about mean We can avoid the
tedious calculations of computing $\bar{x}$ by following step-deviation method. Recall that in
this method, we take an assumed mean which is in the middle or just close to it in the
data. Then deviations of the observations (or mid-points of classes) are taken from the
assumed mean. This is nothing but the shifting of origin from zero to the assumed mean
on the number line, as shown in Fig 15.3

[Fig 15.3: a number line marked from 0 to 120 before deviations, and the same line relabelled from -60 to 60 after deviations, with 0 at the assumed mean]

If there is a common factor of all the deviations, we divide them by this common
factor to further simplify the deviations. These are known as step-deviations. The
process of taking step-deviations is the change of scale on the number line as shown in
Fig 15.4

[Fig 15.4: the number line from 0 to 120, the deviations from the assumed mean from -60 to 60, and the corresponding step-deviations from -6 to 6]

The deviations and step-deviations reduce the size of the observations, so that the
computations viz. multiplication, etc., become simpler. Let the new variable be denoted
by $d_i = \frac{x_i - a}{h}$, where ‘a’ is the assumed mean and h is the common factor. Then, the
mean $\bar{x}$ by step-deviation method is given by

$$\bar{x} = a + \frac{\sum_{i=1}^{n} f_i d_i}{N} \times h$$

Let us take the data of Example 6 and find the mean deviation by using step-deviation
method.

Take the assumed mean a = 45 and h = 10, and form the following Table 15.5.


Table 15.5

Marks       Number of        Mid-points    $d_i = \frac{x_i - 45}{10}$    $f_i d_i$    $|x_i - \bar{x}|$    $f_i |x_i - \bar{x}|$
obtained    students $f_i$   $x_i$
10-20       2                15            -3                             -6           30                   60
20-30       3                25            -2                             -6           20                   60
30-40       8                35            -1                             -8           10                   80
40-50       14               45            0                              0            0                    0
50-60       8                55            1                              8            10                   80
60-70       3                65            2                              6            20                   60
70-80       2                75            3                              6            30                   60
            40                                                            0                                 400

Therefore $\bar{x} = a + \frac{\sum_{i=1}^{7} f_i d_i}{N} \times h = 45 + \frac{0}{40} \times 10 = 45$

and $\text{M.D.}(\bar{x}) = \frac{1}{N}\sum_{i=1}^{7} f_i |x_i - \bar{x}| = \frac{400}{40} = 10$

Note The step deviation method is applied to compute $\bar{x}$. Rest of the procedure
is same.
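
The step-deviation computation of the mean can be sketched as below, using the mid-points and frequencies of Example 6. The function name `step_deviation_mean` is a hypothetical helper; the check against the direct formula is included only to show that both routes agree.

```python
def step_deviation_mean(mid_points, freqs, a, h):
    """Mean by the step-deviation method: a + (sum(f_i * d_i) / N) * h,
    where d_i = (x_i - a) / h, a is the assumed mean and h the common factor."""
    n_total = sum(freqs)
    d = [(x - a) / h for x in mid_points]
    return a + sum(f * di for f, di in zip(freqs, d)) / n_total * h


# Mid-points and frequencies from Example 6.
mid_points = [15, 25, 35, 45, 55, 65, 75]
freqs = [2, 3, 8, 14, 8, 3, 2]

shortcut = step_deviation_mean(mid_points, freqs, a=45, h=10)
direct = sum(f * x for x, f in zip(mid_points, freqs)) / sum(freqs)
print("step-deviation mean:", shortcut)
print("direct mean:", direct)
```

Any assumed mean gives the same result; choosing one near the middle of the data simply keeps the intermediate numbers small.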


(ii) Mean deviation about median The process of finding the mean deviation about
median for a continuous frequency distribution is similar to what we did for mean deviation
about the mean. The only difference lies in the replacement of the mean by median
while taking deviations.

Let us recall the process of finding median for a continuous frequency distribution.

The data is first arranged in ascending order. Then, the median of continuous
frequency distribution is obtained by first identifying the class in which median lies
(median class) and then applying the formula

$$\text{Median} = l + \frac{\frac{N}{2} - C}{f} \times h$$

where median class is the class interval whose cumulative frequency is just greater
than or equal to $\frac{N}{2}$, N is the sum of frequencies, l, f, h and C are, respectively, the lower
limit, the frequency and the width of the median class, and the cumulative frequency of
the class just preceding the median class. After finding the median, the absolute values
of the deviations of mid-point $x_i$ of each class from the median, i.e., $|x_i - M|$ are obtained.

Then $\text{M.D.}(M) = \frac{1}{N}\sum_{i=1}^{n} f_i |x_i - M|$

The process is illustrated in the following example:

Example 7 Calculate the mean deviation about median for the following data:

Class      :  0-10  10-20  20-30  30-40  40-50  50-60
Frequency  :  6     7      15     16     4      2

Solution Form the following Table 15.6 from the given data:

Table 15.6

Class    Frequency    Cumulative          Mid-points    $|x_i - \text{Med.}|$    $f_i |x_i - \text{Med.}|$
         $f_i$        frequency (c.f.)    $x_i$
0-10     6            6                   5             23                       138
10-20    7            13                  15            13                       91
20-30    15           28                  25            3                        45
30-40    16           44                  35            7                        112
40-50    4            48                  45            17                       68
50-60    2            50                  55            27                       54
         50                                                                      508

The class interval containing the $\left(\frac{N}{2}\right)$th or 25th item is 20-30. Therefore, 20-30 is the median
class. We know that

$$\text{Median} = l + \frac{\frac{N}{2} - C}{f} \times h$$

Here l = 20, C = 13, f = 15, h = 10 and N = 50

Therefore, Median = $20 + \frac{25 - 13}{15} \times 10 = 20 + 8 = 28$

Thus, Mean deviation about median is given by

$$\text{M.D.}(M) = \frac{1}{N}\sum_{i=1}^{6} f_i |x_i - M| = \frac{1}{50} \times 508 = 10.16$$
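
The median formula and the mean deviation about the median for a continuous distribution can be sketched as below, using the data of Example 7. The function name `grouped_median` and the assumption of equal class widths `h` are choices made here for illustration.

```python
def grouped_median(lower_limits, freqs, h):
    """Median = l + ((N/2 - C) / f) * h for a continuous frequency distribution.

    lower_limits: lower limit of each class, in ascending order
    freqs: class frequencies; h: common class width (assumed equal for all classes)
    """
    half = sum(freqs) / 2
    cumulative = 0  # C: cumulative frequency of the classes before the current one
    for l, f in zip(lower_limits, freqs):
        if cumulative + f >= half:  # first class whose c.f. reaches N/2
            return l + (half - cumulative) / f * h
        cumulative += f
    raise ValueError("empty distribution")


# Classes 0-10, 10-20, ..., 50-60 from Example 7.
lower_limits = [0, 10, 20, 30, 40, 50]
freqs = [6, 7, 15, 16, 4, 2]
h = 10

med = grouped_median(lower_limits, freqs, h)
mid_points = [l + h / 2 for l in lower_limits]
md = sum(f * abs(x - med) for x, f in zip(mid_points, freqs)) / sum(freqs)
print("median:", med)
print("M.D. about median:", md)
```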

EXERCISE 15.1 

Find the mean deviation about the mean for the data in Exercises 1 and 2.

1. 4, 7, 8, 9, 10, 12, 13, 17

2. 38, 70, 48, 40, 42, 55, 63, 46, 54, 44

Find the mean deviation about the median for the data in Exercises 3 and 4.

3. 13, 17, 16, 14, 11, 13, 10, 16, 11, 18, 12, 17

4. 36, 72, 46, 42, 60, 45, 53, 46, 51, 49

Find the mean deviation about the mean for the data in Exercises 5 and 6.

5. $x_i$ :  5   10  15  20  25
   $f_i$ :  7   4   6   3   5

6. $x_i$ :  10  30  50  70  90
   $f_i$ :  4   24  28  16  8

Find the mean deviation about the median for the data in Exercises 7 and 8.

7. $x_i$ :  5   7   9   10  12  15
   $f_i$ :  8   6   2   2   2   6

8. $x_i$ :  15  21  27  30  35
   $f_i$ :  3   5   6   7   8

Find the mean deviation about the mean for the data in Exercises 9 and 10.

9. Income per day     :  0-100  100-200  200-300  300-400  400-500  500-600  600-700  700-800
   Number of persons  :  4      8        9        10       7        5        4        3

10. Height in cms   :  95-105  105-115  115-125  125-135  135-145  145-155
    Number of boys  :  9       13       26       30       12       10

11. Find the mean deviation about median for the following data:

    Marks            :  0-10  10-20  20-30  30-40  40-50  50-60
    Number of Girls  :  6     8      14     16     4      2

12. Calculate the mean deviation about median age for the age distribution of 100
persons given below:

    Age     :  16-20  21-25  26-30  31-35  36-40  41-45  46-50  51-55
    Number  :  5      6      12     14     26     12     16     9

[Hint Convert the given data into continuous frequency distribution by subtracting 0.5
from the lower limit and adding 0.5 to the upper limit of each class interval]

15.4.3 Limitations of mean deviation In a series, where the degree of variability is
very high, the median is not a representative central tendency. Thus, the mean deviation
about median calculated for such series cannot be fully relied upon.

The sum of the deviations from the mean (minus signs ignored) is more than the
sum of the deviations from median. Therefore, the mean deviation about the mean is
not very scientific. Thus, in many cases, mean deviation may give unsatisfactory results.
Also, mean deviation is calculated on the basis of absolute values of the deviations and
therefore, cannot be subjected to further algebraic treatment. This implies that we
must have some other measure of dispersion. Standard deviation is such a measure of
dispersion.

15.5 Variance and Standard Deviation 

Recall that while calculating mean deviation about mean or median, the absolute values 
of the deviations were taken. The absolute values were taken to give meaning to the 
mean deviation, otherwise the deviations may cancel among themselves. 

Another way to overcome this difficulty, which arose due to the signs of deviations,
is to take squares of all the deviations. Obviously all these squares of deviations are
non-negative. Let $x_1, x_2, x_3, \ldots, x_n$ be n observations and $\bar{x}$ be their mean. Then

$$(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \ldots + (x_n - \bar{x})^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2.$$

If this sum is zero, then each $(x_i - \bar{x})$ has to be zero. This implies that there is no
dispersion at all as all observations are equal to the mean $\bar{x}$.

If $\sum_{i=1}^{n} (x_i - \bar{x})^2$ is small, this indicates that the observations $x_1, x_2, x_3, \ldots, x_n$ are
close to the mean $\bar{x}$ and therefore, there is a lower degree of dispersion. On the
contrary, if this sum is large, there is a higher degree of dispersion of the observations
from the mean $\bar{x}$. Can we thus say that the sum $\sum_{i=1}^{n} (x_i - \bar{x})^2$ is a reasonable indicator
of the degree of dispersion or scatter?

Let us take the set A of six observations 5, 15, 25, 35, 45, 55. The mean of the
observations is $\bar{x}$ = 30. The sum of squares of deviations from $\bar{x}$ for this set is

$$\sum_{i=1}^{6} (x_i - \bar{x})^2 = (5-30)^2 + (15-30)^2 + (25-30)^2 + (35-30)^2 + (45-30)^2 + (55-30)^2$$

= 625 + 225 + 25 + 25 + 225 + 625 = 1750

Let us now take another set B of 31 observations 15, 16, 17, 18, 19, 20, 21, 22, 23,
24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45. The
mean of these observations is $\bar{y}$ = 30

Note that both the sets A and B of observations have a mean of 30.

Now, the sum of squares of deviations of observations for set B from the mean $\bar{y}$ is
given by

$$\sum_{i=1}^{31} (y_i - \bar{y})^2 = (15-30)^2 + (16-30)^2 + (17-30)^2 + \ldots + (44-30)^2 + (45-30)^2$$

$= (-15)^2 + (-14)^2 + \ldots + (-1)^2 + 0^2 + 1^2 + 2^2 + 3^2 + \ldots + 14^2 + 15^2$

$= 2\,[15^2 + 14^2 + \ldots + 1^2]$

$= 2 \times \frac{15 \times (15 + 1)(30 + 1)}{6} = 5 \times 16 \times 31 = 2480$

(Because sum of squares of first n natural numbers $= \frac{n(n+1)(2n+1)}{6}$. Here n = 15)

If $\sum_{i=1}^{n} (x_i - \bar{x})^2$ is simply our measure of dispersion or scatter about mean, we
will tend to say that the set A of six observations has a lesser dispersion about the mean
than the set B of 31 observations, even though the observations in set A are more
scattered from the mean (the range of deviations being from -25 to 25) than in the set
B (where the range of deviations is from -15 to 15).

This is also clear from the following diagrams.

For the set A, we have

[Fig 15.5: the six observations of set A as dots on a number line marked from 0 to 60, with the mean $\bar{x}$ at 30]

For the set B, we have

[Fig 15.6: the 31 observations of set B as dots on a number line marked from 0 to 60, with the mean $\bar{y}$ at 30]

Thus, we can say that the sum of squares of deviations from the mean is not a proper
measure of dispersion. To overcome this difficulty we take the mean of the squares of
the deviations, i.e., we take $\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$. In case of the set A, we have
Mean = $\frac{1}{6} \times 1750 = 291.67$ and in case of the set B, it is $\frac{1}{31} \times 2480 = 80$.
This indicates that the scatter or dispersion is more in set A than the scatter or dispersion
in set B, which conforms with the geometrical representation of the two sets.

Thus, we can take $\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$ as a quantity which leads to a proper measure
of dispersion. This number, i.e., mean of the squares of the deviations from mean is
called the variance and is denoted by $\sigma^2$ (read as sigma square). Therefore, the
variance of n observations $x_1, x_2, \ldots, x_n$ is given by

$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$$
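
The contrast between the sum and the mean of squared deviations for sets A and B can be reproduced with a short sketch. The function names `sum_sq_dev` and `variance` are hypothetical helpers introduced for illustration.

```python
def sum_sq_dev(data):
    """Sum of squares of deviations from the mean."""
    m = sum(data) / len(data)
    return sum((x - m) ** 2 for x in data)


def variance(data):
    """Mean of the squares of deviations from the mean (divides by n)."""
    return sum_sq_dev(data) / len(data)


set_a = [5, 15, 25, 35, 45, 55]
set_b = list(range(15, 46))  # the 31 observations 15, 16, ..., 45

for name, data in (("A", set_a), ("B", set_b)):
    print(name, "sum of squares:", sum_sq_dev(data),
          "variance:", round(variance(data), 2))
```

The raw sum is larger for set B only because it has more observations; dividing by n removes that effect and ranks set A as the more dispersed one.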


15.5.1 Standard Deviation In the calculation of variance, we find that the units of
individual observations $x_i$ and the unit of their mean $\bar{x}$ are different from that of variance,
since variance involves the sum of squares of $(x_i - \bar{x})$. For this reason, the proper
measure of dispersion about the mean of a set of observations is expressed as positive
square-root of the variance and is called standard deviation. Therefore, the standard
deviation, usually denoted by $\sigma$, is given by

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2} \qquad \ldots (1)$$

Let us take the following example to illustrate the calculation of variance and
hence, standard deviation of ungrouped data.

Example 8 Find the Variance of the following data:

6, 8, 10, 12, 14, 16, 18, 20, 22, 24

Solution From the given data we can form the following Table 15.7. The mean is
calculated by step-deviation method taking 14 as assumed mean. The number of
observations is n = 10

Table 15.7

$x_i$    $d_i = \frac{x_i - 14}{2}$    Deviations from mean $(x_i - \bar{x})$    $(x_i - \bar{x})^2$
6        -4                            -9                                        81
8        -3                            -7                                        49
10       -2                            -5                                        25
12       -1                            -3                                        9
14       0                             -1                                        1
16       1                             1                                         1
18       2                             3                                         9
20       3                             5                                         25
22       4                             7                                         49
24       5                             9                                         81
         5                                                                       330

Therefore Mean $\bar{x}$ = assumed mean + $\frac{\sum_{i=1}^{n} d_i}{n} \times h = 14 + \frac{5}{10} \times 2 = 15$

and Variance $(\sigma^2) = \frac{1}{n}\sum_{i=1}^{10} (x_i - \bar{x})^2 = \frac{1}{10} \times 330 = 33$

Thus Standard deviation $(\sigma) = \sqrt{33} = 5.74$
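
The result of Example 8 can be cross-checked with Python's standard library. Since the chapter divides by n, the matching functions are `statistics.pvariance` and `statistics.pstdev` (the population versions), not `variance` and `stdev`, which divide by n - 1. This is a sketch for verification only.

```python
from statistics import mean, pstdev, pvariance

data = [6, 8, 10, 12, 14, 16, 18, 20, 22, 24]

print("mean:", mean(data))
print("variance:", pvariance(data))              # mean of squared deviations (divides by n)
print("standard deviation:", round(pstdev(data), 2))
```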

15.5.2 Standard deviation of a discrete frequency distribution Let the given discrete
frequency distribution be

x :  $x_1$,  $x_2$,  $x_3$,  ...,  $x_n$
f :  $f_1$,  $f_2$,  $f_3$,  ...,  $f_n$

In this case standard deviation

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^2} \qquad \ldots (2)$$

where $N = \sum_{i=1}^{n} f_i$.

Let us take up following example.

Example 9 Find the variance and standard deviation for the following data:

$x_i$ :  4   8   11  17  20  24  32
$f_i$ :  3   5   9   5   4   3   1

Solution Presenting the data in tabular form (Table 15.8), we get

Table 15.8

$x_i$    $f_i$    $f_i x_i$    $x_i - \bar{x}$    $(x_i - \bar{x})^2$    $f_i (x_i - \bar{x})^2$
4        3        12           -10                100                    300
8        5        40           -6                 36                     180
11       9        99           -3                 9                      81
17       5        85           3                  9                      45
20       4        80           6                  36                     144
24       3        72           10                 100                    300
32       1        32           18                 324                    324
         30       420                                                    1374

$N = 30$, $\sum_{i=1}^{7} f_i x_i = 420$, $\sum_{i=1}^{7} f_i (x_i - \bar{x})^2 = 1374$

Therefore $\bar{x} = \frac{\sum_{i=1}^{7} f_i x_i}{N} = \frac{1}{30} \times 420 = 14$

Hence variance $(\sigma^2) = \frac{1}{N}\sum_{i=1}^{7} f_i (x_i - \bar{x})^2 = \frac{1}{30} \times 1374 = 45.8$

and Standard deviation $(\sigma) = \sqrt{45.8} = 6.77$
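
Formula (2) can be sketched in code as below, using the data of Example 9. The function name `frequency_variance` is a hypothetical helper introduced for illustration.

```python
from math import sqrt


def frequency_variance(xs, fs):
    """Variance of a discrete frequency distribution:
    (1/N) * sum(f_i * (x_i - x_bar)**2), with N = sum(f_i)."""
    n_total = sum(fs)
    x_bar = sum(f * x for x, f in zip(xs, fs)) / n_total
    return sum(f * (x - x_bar) ** 2 for x, f in zip(xs, fs)) / n_total


# Data of Example 9.
xs = [4, 8, 11, 17, 20, 24, 32]
fs = [3, 5, 9, 5, 4, 3, 1]

var = frequency_variance(xs, fs)
print("variance:", var)
print("standard deviation:", round(sqrt(var), 2))
```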


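15.5.3 Standard deviation of a continuous frequency distribution The given
continuous frequency distribution can be represented as a discrete frequency distribution
by replacing each class by its mid-point. Then, the standard deviation is calculated by
the technique adopted in the case of a discrete frequency distribution.

If there is a frequency distribution of n classes each class defined by its mid-point
$x_i$ with frequency $f_i$, the standard deviation will be obtained by the formula

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^2},$$

where $\bar{x}$ is the mean of the distribution and $N = \sum_{i=1}^{n} f_i$.

Another formula for standard deviation We know that

$$\text{Variance } (\sigma^2) = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^2 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i^2 + \bar{x}^2 - 2\bar{x}x_i)$$

$$= \frac{1}{N}\left[\sum_{i=1}^{n} f_i x_i^2 + \sum_{i=1}^{n} \bar{x}^2 f_i - \sum_{i=1}^{n} 2\bar{x} f_i x_i\right]$$

$$= \frac{1}{N}\left[\sum_{i=1}^{n} f_i x_i^2 + \bar{x}^2 \sum_{i=1}^{n} f_i - 2\bar{x}\sum_{i=1}^{n} x_i f_i\right]$$

$$= \frac{1}{N}\sum_{i=1}^{n} f_i x_i^2 + \bar{x}^2 - 2\bar{x} \cdot \bar{x} \qquad \left[\text{Here } \frac{1}{N}\sum_{i=1}^{n} x_i f_i = \bar{x} \text{ or } \sum_{i=1}^{n} x_i f_i = N\bar{x}\right]$$

$$= \frac{1}{N}\sum_{i=1}^{n} f_i x_i^2 - \bar{x}^2$$

or

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{n} f_i x_i^2 - \left(\frac{\sum_{i=1}^{n} f_i x_i}{N}\right)^2 = \frac{1}{N^2}\left[N\sum_{i=1}^{n} f_i x_i^2 - \left(\sum_{i=1}^{n} f_i x_i\right)^2\right]$$

Thus, standard deviation $(\sigma) = \frac{1}{N}\sqrt{N\sum_{i=1}^{n} f_i x_i^2 - \left(\sum_{i=1}^{n} f_i x_i\right)^2}$ ... (3)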
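
As a sanity check on the algebra, formula (3) can be compared numerically with the defining formula. The sketch below uses a small illustrative distribution; the function names `sd_by_definition` and `sd_by_formula_3` are hypothetical.

```python
from math import sqrt


def sd_by_definition(xs, fs):
    """sigma = sqrt((1/N) * sum(f_i * (x_i - x_bar)**2))."""
    n_total = sum(fs)
    x_bar = sum(f * x for x, f in zip(xs, fs)) / n_total
    return sqrt(sum(f * (x - x_bar) ** 2 for x, f in zip(xs, fs)) / n_total)


def sd_by_formula_3(xs, fs):
    """sigma = (1/N) * sqrt(N * sum(f_i * x_i**2) - (sum(f_i * x_i))**2)."""
    n_total = sum(fs)
    sum_fx = sum(f * x for x, f in zip(xs, fs))
    sum_fx2 = sum(f * x * x for x, f in zip(xs, fs))
    return sqrt(n_total * sum_fx2 - sum_fx ** 2) / n_total


# Illustrative distribution of values and frequencies.
xs = [3, 8, 13, 18, 23]
fs = [7, 10, 15, 10, 6]

print("by definition:", round(sd_by_definition(xs, fs), 2))
print("by formula (3):", round(sd_by_formula_3(xs, fs), 2))
```

Formula (3) avoids computing the mean first, so with integer data all the arithmetic before the final square root stays in whole numbers.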

Example 10 Calculate the mean, variance and standard deviation for the following
distribution:

Class      :  30-40  40-50  50-60  60-70  70-80  80-90  90-100
Frequency  :  3      7      12     15     8      3      2

Solution From the given data, we construct the following Table 15.9.

Table 15.9

Class     Frequency    Mid-point    $f_i x_i$    $(x_i - \bar{x})^2$    $f_i (x_i - \bar{x})^2$
          $(f_i)$      $(x_i)$
30-40     3            35           105          729                    2187
40-50     7            45           315          289                    2023
50-60     12           55           660          49                     588
60-70     15           65           975          9                      135
70-80     8            75           600          169                    1352
80-90     3            85           255          529                    1587
90-100    2            95           190          1089                   2178
          50                        3100                                10050

Thus Mean $\bar{x} = \frac{1}{N}\sum_{i=1}^{7} f_i x_i = \frac{3100}{50} = 62$

Variance $(\sigma^2) = \frac{1}{N}\sum_{i=1}^{7} f_i (x_i - \bar{x})^2 = \frac{1}{50} \times 10050 = 201$

and Standard deviation $(\sigma) = \sqrt{201} = 14.18$

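Example 11 Find the standard deviation for the following data:

$x_i$ :  3   8   13  18  23
$f_i$ :  7   10  15  10  6

Solution Let us form the following Table 15.10:

Table 15.10

$x_i$    $f_i$    $f_i x_i$    $x_i^2$    $f_i x_i^2$
3        7        21           9          63
8        10       80           64         640
13       15       195          169        2535
18       10       180          324        3240
23       6        138          529        3174
         48       614                     9652

Now, by formula (3), we have

$$\sigma = \frac{1}{N}\sqrt{N\sum f_i x_i^2 - \left(\sum f_i x_i\right)^2} = \frac{1}{48}\sqrt{48 \times 9652 - (614)^2}$$

$$= \frac{1}{48}\sqrt{463296 - 376996} = \frac{1}{48} \times 293.77 = 6.12$$

Therefore, Standard deviation $(\sigma)$ = 6.12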


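15.5.4 Shortcut method to find variance and standard deviation Sometimes the
values of $x_i$ in a discrete distribution or the mid-points $x_i$ of different classes in a
continuous distribution are large and so the calculation of mean and variance becomes
tedious and time consuming. By using step-deviation method, it is possible to simplify
the procedure.

Let the assumed mean be ‘A’ and the scale be reduced to $\frac{1}{h}$ times (h being the
width of class-intervals). Let the step-deviations or the new values be $y_i$,

i.e. $y_i = \frac{x_i - A}{h}$ or $x_i = A + h y_i$ ... (1)

We know that

$$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{N} \qquad \ldots (2)$$

Replacing $x_i$ from (1) in (2), we get

$$\bar{x} = \frac{\sum_{i=1}^{n} f_i (A + h y_i)}{N} = \frac{1}{N}\left(\sum_{i=1}^{n} f_i A + \sum_{i=1}^{n} h f_i y_i\right) = \frac{1}{N}\left(A\sum_{i=1}^{n} f_i + h\sum_{i=1}^{n} f_i y_i\right)$$

$$= A \cdot \frac{N}{N} + h\,\frac{\sum_{i=1}^{n} f_i y_i}{N} \qquad \left(\text{because } \sum_{i=1}^{n} f_i = N\right)$$

Thus $\bar{x} = A + h\bar{y}$ ... (3)

Now Variance of the variable x,

$$\sigma_x^2 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^2 = \frac{1}{N}\sum_{i=1}^{n} f_i (A + h y_i - A - h\bar{y})^2 \qquad \text{(Using (1) and (3))}$$

$$= \frac{1}{N}\sum_{i=1}^{n} f_i h^2 (y_i - \bar{y})^2 = \frac{h^2}{N}\sum_{i=1}^{n} f_i (y_i - \bar{y})^2 = h^2 \times \text{variance of the variable } y_i$$

i.e. $\sigma_x^2 = h^2 \sigma_y^2$

or $\sigma_x = h\sigma_y$ ... (4)

From (3) and (4), we have

$$\sigma_x = \frac{h}{N}\sqrt{N\sum_{i=1}^{n} f_i y_i^2 - \left(\sum_{i=1}^{n} f_i y_i\right)^2} \qquad \ldots (5)$$

Let us solve Example 10 by the short-cut method and using formula (5)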

12 Calculate mean, Variance and Standard Deviation for the following 

distribution. 


Classes 

30-40 

40-50 

50-60 

60-70 

70-80 

80-90 

90-100 

Frequency 

3 

7 

12 

15 

8 

3 

2 


Let the assumed mean A = 65. Here h= 10 
We obtain the following Table 15.11 from the given data : 


Table 15.11 


Class 

Frequency 

Mid-point 

Xf - 65 


f,y, 

fy 2 

J i y i 

10 


/, 

X. 

1 





30-40 

3 

35 

-3 

9 

-9 

27 

40-50 

7 

45 

-2 

4 

- 14 

28 

50-60 

12 

55 

- 1 

1 

- 12 

12 

60-70 

15 

65 

0 

0 

0 

0 

70-80 

8 

75 

1 

1 

8 

8 

80-90 

3 

85 

2 

4 

6 

12 

9 0-100 

2 

95 

3 

9 

6 

18 


N=50 




- 15 

105 
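
Therefore $$\bar{x} = A + \frac{\sum f_i y_i}{N} \times h = 65 - \frac{15}{50} \times 10 = 62$$

Variance $$\sigma^2 = \frac{h^2}{N^2}\left[N\sum f_i y_i^2 - \left(\sum f_i y_i\right)^2\right] = \frac{(10)^2}{(50)^2}\left[50 \times 105 - (-15)^2\right] = \frac{1}{25}\,[5250 - 225] = 201$$

and standard deviation ($\sigma$) = $\sqrt{201}$ = 14.18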
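
The step-deviation calculation of Example 12 can be checked with a short program. The following is a minimal sketch, assuming Python; the function name `step_deviation_stats` and its parameters are illustrative and are not part of the text. It applies formulas (3) and (5) to the mid-points and frequencies of Table 15.11.

```python
import math

def step_deviation_stats(midpoints, freqs, assumed_mean, h):
    """Return (mean, variance, standard deviation) of grouped data
    using the step-deviation (shortcut) method."""
    n_total = sum(freqs)                                   # N = sum of f_i
    # Step deviations y_i = (x_i - A) / h
    ys = [(x - assumed_mean) / h for x in midpoints]
    sum_fy = sum(f * y for f, y in zip(freqs, ys))         # sum of f_i * y_i
    sum_fy2 = sum(f * y * y for f, y in zip(freqs, ys))    # sum of f_i * y_i^2
    mean = assumed_mean + h * sum_fy / n_total             # formula (3)
    variance = (h ** 2 / n_total ** 2) * (n_total * sum_fy2 - sum_fy ** 2)
    return mean, variance, math.sqrt(variance)             # sd as in formula (5)

# Mid-points and frequencies from Table 15.11, with A = 65 and h = 10
midpoints = [35, 45, 55, 65, 75, 85, 95]
freqs = [3, 7, 12, 15, 8, 3, 2]
mean, variance, sd = step_deviation_stats(midpoints, freqs, assumed_mean=65, h=10)
print(round(mean, 2), round(variance, 2), round(sd, 2))   # 62.0 201.0 14.18
```

Any other choice of assumed mean, for instance A = 55, should give the same three values, since A and h only change the intermediate column sums.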


EXERCISE 15.2 

Find the mean and variance for each of the data in Exercises 1 to 5.


1. 6,7,10,12,13,4,8,12 

2. First n natural numbers 

3. First 10 multiples of 3 



6. Find the mean and standard deviation using short-cut method.

$x_i$    60   61   62   63   64   65   66   67   68
$f_i$     2    1   12   29   25   12   10    4    5


Find the mean and variance for the following frequency distributions in Exercises 
7 and 8. 


7. Classes        0-30   30-60   60-90   90-120   120-150   150-180   180-210
   Frequencies      2      3       5       10        3         5         2

8. Classes        0-10   10-20   20-30   30-40   40-50
   Frequencies      5      8      15      16       6


9. Find the mean, variance and standard deviation using short-cut method.

Height in cms     70-75   75-80   80-85   85-90   90-95   95-100   100-105   105-110   110-115
No. of children     3       4       7       7      15       9         6         6         3


10. The diameters of circles (in mm) drawn in a design are given below:

Diameters        33-36   37-40   41-44   45-48   49-52
No. of circles     15      17      21      22      25

Calculate the standard deviation and mean diameter of the circles.

[Hint: First make the data continuous by making the classes as 32.5-36.5, 36.5-40.5,
40.5-44.5, 44.5-48.5, 48.5-52.5 and then proceed.]

15.6 Analysis of Frequency Distributions 

In earlier sections, we have studied about some types of measures of dispersion. The 
mean deviation and the standard deviation have the same units in which the data are 
given. Whenever we want to compare the variability of two series with same mean, 
which are measured in different units, we do not merely calculate the measures of 
dispersion but we require such measures which are independent of the units. The 
measure of variability which is independent of units is called coefficient of variation 
(denoted as C.V.) 

The coefficient of variation is defined as

$$\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100, \quad \bar{x} \neq 0,$$

where $\sigma$ and $\bar{x}$ are the standard deviation and mean of the data.

For comparing the variability or dispersion of two series, we calculate the coefficient
of variation for each series. The series having greater C.V. is said to be more variable
than the other. The series having lesser C.V. is said to be more consistent than the
other.

15.6.1 Comparison of two frequency distributions with same mean Let $\bar{x}_1$ and $\sigma_1$
be the mean and standard deviation of the first distribution, and $\bar{x}_2$ and $\sigma_2$ be the
mean and standard deviation of the second distribution.

Then C.V. (1st distribution) = $\frac{\sigma_1}{\bar{x}_1} \times 100$

and C.V. (2nd distribution) = $\frac{\sigma_2}{\bar{x}_2} \times 100$

Given $\bar{x}_1 = \bar{x}_2 = \bar{x}$ (say)

Therefore C.V. (1st distribution) = $\frac{\sigma_1}{\bar{x}} \times 100$ ... (1)

and C.V. (2nd distribution) = $\frac{\sigma_2}{\bar{x}} \times 100$ ... (2)

It is clear from (1) and (2) that the two C.Vs. can be compared on the basis of values
of $\sigma_1$ and $\sigma_2$ only.

Thus, we say that for two series with equal means, the series with greater standard 
deviation (or variance) is called more variable or dispersed than the other. Also, the 
series with lesser value of standard deviation (or variance) is said to be more consistent 
than the other. 

Let us now take following examples: 

Example 13 Two plants A and B of a factory show following results about the number
of workers and the wages paid to them.

                                          A          B
No. of workers                          5000       6000
Average monthly wages                 Rs 2500    Rs 2500
Variance of distribution of wages        81        100

In which plant, A or B, is there greater variability in individual wages?

Solution The variance of the distribution of wages in plant A ($\sigma_1^2$) = 81
Therefore, standard deviation of the distribution of wages in plant A ($\sigma_1$) = 9

Also, the variance of the distribution of wages in plant B ($\sigma_2^2$) = 100

Therefore, standard deviation of the distribution of wages in plant B ($\sigma_2$) = 10

Since the average monthly wages in both the plants is same, i.e., Rs 2500, therefore,
the plant with greater standard deviation will have more variability.

Thus, the plant B has greater variability in the individual wages.

Example 14 Coefficients of variation of two distributions are 60 and 70, and their
standard deviations are 21 and 16, respectively. What are their arithmetic means?

Solution Given C.V. (1st distribution) = 60, $\sigma_1$ = 21

C.V. (2nd distribution) = 70, $\sigma_2$ = 16

Let $\bar{x}_1$ and $\bar{x}_2$ be the means of 1st and 2nd distribution, respectively. Then

C.V. (1st distribution) = $\frac{\sigma_1}{\bar{x}_1} \times 100$

Therefore $60 = \frac{21}{\bar{x}_1} \times 100$ or $\bar{x}_1 = \frac{21}{60} \times 100 = 35$

and C.V. (2nd distribution) = $\frac{\sigma_2}{\bar{x}_2} \times 100$

i.e. $70 = \frac{16}{\bar{x}_2} \times 100$ or $\bar{x}_2 = \frac{16}{70} \times 100 \approx 22.86$

Example 15 The following values are calculated in respect of heights and weights of
the students of a section of Class XI:

             Height           Weight
Mean         162.6 cm         52.36 kg
Variance     127.69 cm$^2$    23.1361 kg$^2$

Can we say that the weights show greater variation than the heights?

Solution To compare the variability, we have to calculate their coefficients of variation.
Given Variance of height = 127.69 cm$^2$

Therefore Standard deviation of height = $\sqrt{127.69}$ cm = 11.3 cm

Also Variance of weight = 23.1361 kg$^2$

Therefore Standard deviation of weight = $\sqrt{23.1361}$ kg = 4.81 kg

Now, the coefficients of variation (C.V.) are given by

(C.V.) in heights = $\frac{\text{Standard Deviation}}{\text{Mean}} \times 100 = \frac{11.3}{162.6} \times 100 \approx 6.95$

and (C.V.) in weights = $\frac{4.81}{52.36} \times 100 \approx 9.19$

Clearly C.V. in weights is greater than the C.V. in heights.
Therefore, we can say that weights show more variability than heights.
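
The comparison in this example can be reproduced in a few lines. This is a minimal sketch, assuming Python; the helper name `coefficient_of_variation` is illustrative and does not come from the text. It takes the mean and variance as given, so that the standard deviation is obtained inside the function.

```python
import math

def coefficient_of_variation(mean, variance):
    """C.V. = (standard deviation / mean) * 100; the mean must be non-zero."""
    if mean == 0:
        raise ValueError("C.V. is undefined when the mean is zero")
    return math.sqrt(variance) / mean * 100

# Means and variances of heights (cm) and weights (kg) from the example above
cv_height = coefficient_of_variation(162.6, 127.69)
cv_weight = coefficient_of_variation(52.36, 23.1361)

# The series with the greater C.V. is the more variable one
more_variable = "weights" if cv_weight > cv_height else "heights"
print(round(cv_height, 2), round(cv_weight, 2), more_variable)
```

Because C.V. is a ratio of two quantities in the same unit, the centimetres and kilograms cancel, which is what makes the two numbers comparable.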


EXERCISE 15.3 

1. From the data given below state which group is more variable, A or B? 


Marks      10-20   20-30   30-40   40-50   50-60   60-70   70-80
Group A      9      17      32      33      40      10       9
Group B     10      20      30      25      43      15       7


2. From the prices of shares X and Y below, find out which is more stable in value:

X     35    54    52    53    56    58    52    50    51    49
Y    108   107   105   105   106   107   104   103   104   101


3. An analysis of monthly wages paid to workers in two firms A and B, belonging to
the same industry, gives the following results:

                                           Firm A     Firm B
No. of wage earners                          586        648
Mean of monthly wages                      Rs 5253    Rs 5253
Variance of the distribution of wages        100        121

(i) Which firm, A or B, pays larger amount as monthly wages?

(ii) Which firm, A or B, shows greater variability in individual wages?

4. The following is the record of goals scored by team A in a football session:

No. of goals scored    0   1   2   3   4
No. of matches         1   9   7   5   3

For the team B, mean number of goals scored per match was 2 with a standard
deviation 1.25 goals. Find which team may be considered more consistent?

5. The sum and sum of squares corresponding to length x (in cm) and weight y
(in gm) of 50 plant products are given below:

$$\sum_{i=1}^{50} x_i = 212, \quad \sum_{i=1}^{50} x_i^2 = 902.8, \quad \sum_{i=1}^{50} y_i = 261, \quad \sum_{i=1}^{50} y_i^2 = 1457.6$$

Which is more varying, the length or weight?


Miscellaneous Examples 

Example 16 The variance of 20 observations is 5. If each observation is multiplied by
2, find the new variance of the resulting observations.

Solution Let the observations be $x_1, x_2, \ldots, x_{20}$ and $\bar{x}$ be their mean. Given that
variance = 5 and n = 20. We know that

$$\text{Variance } (\sigma^2) = \frac{1}{n}\sum_{i=1}^{20}(x_i - \bar{x})^2, \quad \text{i.e.,} \quad 5 = \frac{1}{20}\sum_{i=1}^{20}(x_i - \bar{x})^2$$

or $$\sum_{i=1}^{20}(x_i - \bar{x})^2 = 100 \qquad \ldots (1)$$

If each observation is multiplied by 2, and the new resulting observations are $y_i$, then

$$y_i = 2x_i \quad \text{i.e.,} \quad x_i = \frac{1}{2}\,y_i$$

Therefore $$\bar{y} = \frac{1}{n}\sum_{i=1}^{20} y_i = \frac{1}{20}\sum_{i=1}^{20} 2x_i = 2 \cdot \frac{1}{20}\sum_{i=1}^{20} x_i$$

i.e. $\bar{y} = 2\bar{x}$ or $\bar{x} = \frac{1}{2}\,\bar{y}$

Substituting the values of $x_i$ and $\bar{x}$ in (1), we get

$$\sum_{i=1}^{20}\left(\frac{1}{2}\,y_i - \frac{1}{2}\,\bar{y}\right)^2 = 100, \quad \text{i.e.,} \quad \sum_{i=1}^{20}(y_i - \bar{y})^2 = 400$$

Thus the variance of new observations = $\frac{1}{20} \times 400 = 20 = 2^2 \times 5$

The reader may note that if each observation is multiplied by a constant
k, the variance of the resulting observations becomes $k^2$ times the original variance.
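
The scaling rule can be checked numerically on any data set. The sketch below assumes Python; the list `data` is a hypothetical set of five observations chosen for illustration (it is not the set of 20 observations of Example 16, which the text does not list), and `variance` is an illustrative helper implementing the definition with divisor n.

```python
def variance(values):
    """Variance with divisor n: mean of squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

data = [2, 4, 6, 8, 10]            # hypothetical observations
k = 2                              # multiplier, as in Example 16
scaled = [k * v for v in data]     # each observation multiplied by k

print(variance(data), variance(scaled))   # 8.0 32.0
```

Here the variance grows from 8 to 32, that is, by the factor $k^2 = 4$; the same check with a negative k gives the same factor, since k enters only through its square.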


Example 17 The mean of 5 observations is 4.4 and their variance is 8.24. If three of
the observations are 1, 2 and 6, find the other two observations.

Solution Let the other two observations be x and y.

Therefore, the series is 1, 2, 6, x, y.

Now Mean $\bar{x} = 4.4 = \frac{1 + 2 + 6 + x + y}{5}$

or 22 = 9 + x + y

Therefore x + y = 13 ... (1)

Also variance = 8.24 = $\frac{1}{n}\sum_{i=1}^{5}(x_i - \bar{x})^2$

i.e. $$8.24 = \frac{1}{5}\left[(3.4)^2 + (2.4)^2 + (1.6)^2 + x^2 + y^2 - 2 \times 4.4\,(x + y) + 2 \times (4.4)^2\right]$$

or $41.20 = 11.56 + 5.76 + 2.56 + x^2 + y^2 - 8.8 \times 13 + 38.72$

Therefore $x^2 + y^2 = 97$ ... (2)

But from (1), we have

$x^2 + y^2 + 2xy = 169$ ... (3)

From (2) and (3), we have

$2xy = 72$ ... (4)

Subtracting (4) from (2), we get

$x^2 + y^2 - 2xy = 97 - 72$ i.e. $(x - y)^2 = 25$
or $x - y = \pm 5$ ... (5)

So, from (1) and (5), we get

x = 9, y = 4 when x - y = 5
or x = 4, y = 9 when x - y = -5
Thus, the remaining observations are 4 and 9.
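
The same two unknowns can be recovered programmatically. The following is a rough sketch, assuming Python; the function `missing_pair` and its parameter names are hypothetical. Instead of expanding $(x_i - \bar{x})^2$ term by term as above, it uses the equivalent identity $\sum x_i^2 = n(\sigma^2 + \bar{x}^2)$ to obtain $x^2 + y^2$, and then follows steps (1) to (5).

```python
import math

def missing_pair(known, n, mean, variance):
    """Find the two unknown observations, given the mean and variance
    (divisor n) of all n observations and the n - 2 known ones."""
    s = n * mean - sum(known)                       # x + y, as in (1)
    # Sum of squares of all observations equals n * (variance + mean^2)
    q = n * (variance + mean ** 2) - sum(v * v for v in known)   # x^2 + y^2, as in (2)
    product = (s * s - q) / 2                       # xy, from (3) and (4)
    diff = math.sqrt(s * s - 4 * product)           # |x - y|, as in (5)
    return (s - diff) / 2, (s + diff) / 2           # smaller value first

x, y = missing_pair([1, 2, 6], n=5, mean=4.4, variance=8.24)
print(round(x, 6), round(y, 6))                     # 4.0 9.0
```

The results are rounded before printing because 4.4 and 8.24 are not exactly representable in binary floating point.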


Example 18 If each of the observations $x_1, x_2, \ldots, x_n$ is increased by ‘a’, where a is a
negative or positive number, show that the variance remains unchanged.

Solution Let $\bar{x}$ be the mean of $x_1, x_2, \ldots, x_n$. Then the variance is given by

$$\sigma_1^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$$

If ‘a’ is added to each observation, the new observations will be

$$y_i = x_i + a \qquad \ldots (1)$$

Let the mean of the new observations be $\bar{y}$. Then

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i = \frac{1}{n}\sum_{i=1}^{n}(x_i + a) = \frac{1}{n}\left[\sum_{i=1}^{n} x_i + \sum_{i=1}^{n} a\right] = \frac{1}{n}\sum_{i=1}^{n} x_i + \frac{na}{n} = \bar{x} + a$$

i.e. $$\bar{y} = \bar{x} + a \qquad \ldots (2)$$

Thus, the variance of the new observations

$$\sigma_2^2 = \frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y})^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i + a - \bar{x} - a)^2 \qquad \text{[Using (1) and (2)]}$$

$$= \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \sigma_1^2$$

Thus, the variance of the new observations is same as that of the original observations.

We may note that adding (or subtracting) a positive number to (or from)
each observation of a group does not affect the variance.
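
This invariance is easy to confirm on concrete numbers. The sketch below assumes Python and uses `statistics.pvariance` from the standard library, which computes the variance with divisor n as defined in this chapter; the data are the five observations found in Example 17, and the two shift values are arbitrary choices for illustration.

```python
import math
import statistics

data = [1, 2, 6, 4, 9]                     # observations from Example 17
original = statistics.pvariance(data)      # variance with divisor n

for a in (5, -3):                          # one positive and one negative shift
    shifted = [x + a for x in data]
    # The mean moves by a, so every deviation from the mean is unchanged
    assert math.isclose(statistics.pvariance(shifted), original)

print(round(original, 2))                  # 8.24
```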


Example 19 The mean and standard deviation of 100 observations were calculated as
40 and 5.1, respectively by a student who took by mistake 50 instead of 40 for one
observation. What are the correct mean and standard deviation?

Solution Given that number of observations (n) = 100
Incorrect mean ($\bar{x}$) = 40,

Incorrect standard deviation ($\sigma$) = 5.1

We know that $$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

i.e. $$40 = \frac{1}{100}\sum_{i=1}^{100} x_i \quad \text{or} \quad \sum_{i=1}^{100} x_i = 4000$$

i.e. Incorrect sum of observations = 4000

Thus the correct sum of observations = Incorrect sum - 50 + 40

= 4000 - 50 + 40 = 3990

Hence Correct mean = $\frac{\text{correct sum}}{100} = \frac{3990}{100} = 39.9$

Also Standard deviation $$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2 - \frac{1}{n^2}\left(\sum_{i=1}^{n} x_i\right)^2} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2 - (\bar{x})^2}$$

i.e. $$5.1 = \sqrt{\frac{1}{100} \times \text{Incorrect} \sum_{i=1}^{n} x_i^2 - (40)^2}$$

or $$26.01 = \frac{1}{100} \times \text{Incorrect} \sum_{i=1}^{n} x_i^2 - 1600$$

Therefore Incorrect $\sum_{i=1}^{n} x_i^2$ = 100 (26.01 + 1600) = 162601

Now Correct $\sum_{i=1}^{n} x_i^2$ = Incorrect $\sum_{i=1}^{n} x_i^2 - (50)^2 + (40)^2$

= 162601 - 2500 + 1600 = 161701

Therefore Correct standard deviation

$$= \sqrt{\frac{\text{Correct} \sum x_i^2}{n} - (\text{Correct mean})^2} = \sqrt{\frac{161701}{100} - (39.9)^2} = \sqrt{1617.01 - 1592.01} = \sqrt{25} = 5$$
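
The correction procedure of this example generalises to any single misread observation. The following is a minimal sketch, assuming Python; the function name `corrected_mean_sd` and its parameters are illustrative. It rebuilds the sum and the sum of squares from the reported mean and standard deviation, swaps the wrong value for the right one, and recomputes both measures.

```python
import math

def corrected_mean_sd(n, mean, sd, wrong, right):
    """Correct a mean and standard deviation (divisor n) that were computed
    with one observation recorded as `wrong` instead of `right`."""
    total = n * mean - wrong + right                           # corrected sum of x_i
    # Incorrect sum of squares is n * (sd^2 + mean^2); then swap the squares
    sum_sq = n * (sd ** 2 + mean ** 2) - wrong ** 2 + right ** 2
    new_mean = total / n
    new_sd = math.sqrt(sum_sq / n - new_mean ** 2)
    return new_mean, new_sd

mean, sd = corrected_mean_sd(n=100, mean=40, sd=5.1, wrong=50, right=40)
print(round(mean, 2), round(sd, 2))                            # 39.9 5.0
```

The sketch covers replacement of one value only; omitting an observation, as in some of the exercises, would also require reducing n by one.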

Miscellaneous Exercise On Chapter 15

1. The mean and variance of eight observations are 9 and 9.25, respectively. If six
of the observations are 6, 7, 10, 12, 12 and 13, find the remaining two observations.

2. The mean and variance of 7 observations are 8 and 16, respectively. If five of the
observations are 2, 4, 10, 12, 14, find the remaining two observations.

3. The mean and standard deviation of six observations are 8 and 4, respectively. If
each observation is multiplied by 3, find the new mean and new standard deviation
of the resulting observations.

4. Given that $\bar{x}$ is the mean and $\sigma^2$ is the variance of n observations $x_1, x_2, \ldots, x_n$.
Prove that the mean and variance of the observations $ax_1, ax_2, ax_3, \ldots, ax_n$ are
$a\bar{x}$ and $a^2\sigma^2$, respectively, ($a \neq 0$).

5. The mean and standard deviation of 20 observations are found to be 10 and 2,
respectively. On rechecking, it was found that an observation 8 was incorrect.
Calculate the correct mean and standard deviation in each of the following cases:
(i) If wrong item is omitted. (ii) If it is replaced by 12.

6. The mean and standard deviation of marks obtained by 50 students of a class in
three subjects, Mathematics, Physics and Chemistry are given below:

Subject               Mathematics   Physics   Chemistry
Mean                      42           32        40.9
Standard deviation        12           15        20

Which of the three subjects shows the highest variability in marks and which
shows the lowest?

7. The mean and standard deviation of a group of 100 observations were found to
be 20 and 3, respectively. Later on it was found that three observations were
incorrect, which were recorded as 21, 21 and 18. Find the mean and standard
deviation if the incorrect observations are omitted.


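Summary

Measures of dispersion Range, quartile deviation, mean deviation, variance,
standard deviation are measures of dispersion.

Range = Maximum Value - Minimum Value

Mean deviation for ungrouped data

$$\text{M.D.}(\bar{x}) = \frac{\sum |x_i - \bar{x}|}{n}, \qquad \text{M.D.}(M) = \frac{\sum |x_i - M|}{n}$$

Mean deviation for grouped data

$$\text{M.D.}(\bar{x}) = \frac{\sum f_i |x_i - \bar{x}|}{N}, \qquad \text{M.D.}(M) = \frac{\sum f_i |x_i - M|}{N}, \quad \text{where } N = \sum f_i$$

Variance and standard deviation for ungrouped data

$$\sigma^2 = \frac{1}{n}\sum (x_i - \bar{x})^2, \qquad \sigma = \sqrt{\frac{1}{n}\sum (x_i - \bar{x})^2}$$

Variance and standard deviation of a discrete frequency distribution

$$\sigma^2 = \frac{1}{N}\sum f_i (x_i - \bar{x})^2, \qquad \sigma = \sqrt{\frac{1}{N}\sum f_i (x_i - \bar{x})^2}$$

Variance and standard deviation of a continuous frequency distribution

$$\sigma^2 = \frac{1}{N}\sum f_i (x_i - \bar{x})^2, \qquad \sigma = \frac{1}{N}\sqrt{N\sum f_i x_i^2 - \left(\sum f_i x_i\right)^2}$$

Shortcut method to find variance and standard deviation.

$$\sigma^2 = \frac{h^2}{N^2}\left[N\sum f_i y_i^2 - \left(\sum f_i y_i\right)^2\right], \qquad \sigma = \frac{h}{N}\sqrt{N\sum f_i y_i^2 - \left(\sum f_i y_i\right)^2}, \quad \text{where } y_i = \frac{x_i - A}{h}$$

Coefficient of variation (C.V.) = $\frac{\sigma}{\bar{x}} \times 100$, $\bar{x} \neq 0$.

For series with equal means, the series with lesser standard deviation is more consistent
or less scattered.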


Historical Note 

‘Statistics’ is derived from the Latin word ‘status’ which means a political 
state. This suggests that statistics is as old as human civilisation. In the year 3050 
B.C., perhaps the first census was held in Egypt. In India also, about 2000 years 
ago, we had an efficient system of collecting administrative statistics, particularly, 
during the regime of Chandra Gupta Maurya (324-300 B.C.). The system of 
collecting data related to births and deaths is mentioned in Kautilya’s Arthshastra 
(around 300 B.C.) A detailed account of administrative surveys conducted during 
Akbar’s regime is given in Ain-I-Akbari written by Abul Fazl. 

Captain John Graunt of London (1620-1674) is known as father of vital
statistics due to his studies on statistics of births and deaths. Jacob Bernoulli
(1654-1705) stated the Law of Large Numbers in his book ‘Ars Conjectandi’,
published in 1713.

The theoretical development of statistics came during the mid seventeenth
century and continued after that with the introduction of theory of games and
chance (i.e., probability). Francis Galton (1822-1911), an Englishman, pioneered
the use of statistical methods in the field of Biometry. Karl Pearson (1857-1936)
contributed a lot to the development of statistical studies with his discovery
of Chi square test and foundation of statistical laboratory in England (1911).
Sir Ronald A. Fisher (1890-1962), known as the Father of modern statistics,
applied it to various diversified fields such as Genetics, Biometry, Education,
Agriculture, etc.







